PCOS analysis

Agnes Lorenzen, Cecille Hobbs, Freja E. Klippmann, Julie Dalgaard Petersen & Mille Rask Sander

Introduction

Background

  • Polycystic ovary syndrome (PCOS) is a syndrome documented in women of menstruating age

  • Commonly documented symptoms are period pain, irregular periods, ovary-related problems and hormone imbalance

  • Patients with PCOS often have problems with pregnancy and potential complications during pregnancy

  • However, the cause of PCOS has not yet been established

Aim

The aim of this study is to examine a data set (found on Kaggle) of patients with and without PCOS. The data set was collected in India, and the data come from 10 different hospitals.

Data handling approach (attempt 1)

The raw data we received contained 541 observations of 45 variables

02 Clean data 03 Augment data
  • Fixing stray cell values by replacing them with NA

  • Renaming & factorizing columns

  • Splitting the data frame into body and blood measurements

  • Removing an empty column

  • Unit changes (inch to cm)

  • Rounding BMI

  • Grouping & BMI

  • Changing blood type and cycles from numeric values to characters

  • Creating a new column for cycle/pregnancy stage

  • Merging the data frames into one file
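
A few of the steps listed above can be sketched roughly as follows. This is a minimal illustration on a toy data frame, not the project's actual scripts: all column names, the malformed value, the BMI class cut-offs and the blood group codes are assumptions made for the example.

```r
library(dplyr)

# Toy stand-in for the raw data; column names and values are hypothetical
raw <- tibble(
  pcos        = c(0, 1, 0, 1),
  weight_kg   = c("60", "72.5", "1.99.", "55"),  # third cell is malformed
  height_cm   = c(160, 165, 170, 158),
  waist_inch  = c(30, 36, 32, 28),
  blood_group = c(11, 13, 15, 11)
)

# Hypothetical mapping from numeric codes to blood group labels
blood_lookup <- c("11" = "A+", "13" = "B+", "15" = "O+")

# Clean: malformed cells become NA, diagnosis becomes a labelled factor
clean <- raw %>%
  mutate(
    weight_kg      = suppressWarnings(as.numeric(weight_kg)),
    pcos_diagnosis = factor(pcos, levels = c(0, 1), labels = c("No", "Yes"))
  ) %>%
  select(-pcos)

# Augment: unit change, rounded BMI, BMI classes, codes to characters
augmented <- clean %>%
  mutate(
    waist_cm    = waist_inch * 2.54,
    bmi         = round(weight_kg / (height_cm / 100)^2, 1),
    bmi_class   = cut(bmi,
                      breaks = c(-Inf, 18.5, 25, 30, Inf),
                      labels = c("Underweight", "Normal", "Overweight", "Obese"),
                      right  = FALSE),
    blood_group = unname(blood_lookup[as.character(blood_group)])
  )
```

Coercing with `as.numeric()` is one simple way to turn unparseable cells into `NA`; it assumes every other value in the column is a valid number.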

Descriptive analysis of data

# A tibble: 2 × 1
  `PCOS dimensions`
              <int>
1               541
2                44
# A tibble: 2 × 2
  PCOS_diagnosis     n
  <chr>          <int>
1 No               364
2 Yes              177
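
Summaries of this kind can be produced with a couple of dplyr calls. The sketch below uses a small made-up data frame, so the object name `pcos` and its contents are assumptions and the counts will not match the real data.

```r
library(dplyr)

# Toy data; the real data set has 541 rows and 44 columns
pcos <- tibble(PCOS_diagnosis = c("No", "Yes", "No", "No", "Yes"))

# Number of rows and columns as a one-column tibble
dims <- tibble(`PCOS dimensions` = dim(pcos))

# Number of patients per diagnosis group
counts <- pcos %>% count(PCOS_diagnosis)
```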

knitr::include_graphics("../results/04_age_hist.png")  # out.width = "50%" belongs in the chunk options, not in this call

Analysis 1

hh

hh

Analysis 2

here

PCA of blood measurements

![](../results/07_blood_PCA.png){.absolute bottom=0 left=0 width="500" height="330"}
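
A PCA of this kind could be computed along the following lines. This is only a sketch using base R's `prcomp()` on simulated values: the variable names and distributions are assumptions, and the project's own analysis script may differ.

```r
set.seed(1)
n <- 50

# Simulated blood measurements; names and distributions are hypothetical
blood <- data.frame(
  hemoglobin = rnorm(n, mean = 11, sd = 1),
  fsh        = rnorm(n, mean = 5,  sd = 2),
  lh         = rnorm(n, mean = 4,  sd = 2),
  tsh        = rnorm(n, mean = 3,  sd = 1)
)

# Center and scale so variables on different scales contribute equally
pca <- prcomp(blood, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Scores on the first two components, e.g. for a scatter plot by diagnosis
scores <- as.data.frame(pca$x[, 1:2])
```

Note that `prcomp()` does not accept missing values, so rows containing `NA` would need to be dropped or imputed first.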

PCA of body measurements

here

Discussion

here

Conclusion

  • No statistically significant findings